Efficiency Analysis of Competing Tests for Finding Differentially Expressed Genes in Lung Adenocarcinoma
In this study, we introduce and apply Efficiency Analysis to compare the apparent internal and external consistency of competing normalization methods and tests for identifying differentially expressed genes. Using publicly available data, two lung adenocarcinoma datasets were analyzed with caGEDA (http://bioinformatics2.pitt.edu/GE2/GEDA.html) to measure the degree of differential expression of genes between two populations. Each dataset was randomly split into at least two subsets, each subset was analyzed for genes differentially expressed between the two sample groups, and the resulting gene lists were compared for overlap. Efficiency Analysis is an intuitive method that compares the percentage of overlapping genes found by the same test in two or more data subsets, across a range of testing methods. Tests that yield consistent gene lists across independently analyzed splits are preferred to those that yield less consistent inferences. For example, a method that exhibits 50% overlap in the top 100 genes from two studies should be preferred to a method that exhibits 5% overlap in the top 100 genes. The same procedure was performed with every normalization and transformation method available through caGEDA. The 'best' test was then further evaluated using internal cross-validation to estimate generalizable sample classification error with a Naïve Bayes classification algorithm. A novel test, termed D1 (a derivative of the J5 test), was found to be the most consistent and to exhibit the lowest overall classification error and the highest sensitivity and specificity. The D1 test relaxes the assumption that few genes are differentially expressed. Efficiency Analysis can be misleading if the tests exhibit a bias in any particular dimension (e.g. expression intensity); we therefore also explored intensity-scaled and segmented J5 tests using data in which all genes are scaled to share the same intensity distribution range.
Efficiency Analysis correctly predicted the 'best' test and normalization method for the Beer dataset and also performed well on the Bhattacharjee dataset under both efficiency and classification-accuracy criteria.
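The split-and-overlap procedure described in the abstract is generic and easy to sketch. The following is a minimal illustration only (not caGEDA's implementation; the function names and the per-gene score dictionaries are assumptions for the example): given differential-expression scores computed independently on two data splits, it reports the percentage of genes shared between the two top-n lists.

```python
def top_genes(scores, n=100):
    """The n genes with the highest differential-expression scores in one data split."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:n])

def efficiency(scores_a, scores_b, n=100):
    """Percent overlap of the top-n gene lists from two independently analyzed splits.

    Under Efficiency Analysis, the test/normalization combination with the
    higher overlap is preferred, since it yields more consistent inferences.
    """
    return 100.0 * len(top_genes(scores_a, n) & top_genes(scores_b, n)) / n
```

By the criterion in the abstract, a method scoring 50% here would be preferred to one scoring 5% on the same pair of splits.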
Large-scale mapping of mutations affecting zebrafish development
BACKGROUND: Large-scale mutagenesis screens in the zebrafish employing the mutagen ENU have isolated several hundred mutant loci that represent putative developmental control genes. To realize the potential of such screens, systematic genetic mapping of the mutations is necessary. Here we report on a large-scale effort to map the mutations generated in mutagenesis screening at the Max Planck Institute for Developmental Biology by genome scanning with microsatellite markers. RESULTS: We selected a set of microsatellite markers and developed methods and scoring criteria suitable for efficient, high-throughput genome scanning. Using these methods, we obtained rough map positions for 319 mutant loci from the Tübingen I mutagenesis screen and subsequent screening of the mutant collection. For 277 of these, the corresponding gene has not yet been identified. Mapping was successful for 80% of the tested loci. By comparing the mutation and gene positions of 21 cloned mutations, we validated the correctness of our linkage-group assignments and estimated the standard error of our map positions to be approximately 6 cM. CONCLUSION: By obtaining rough map positions for over 300 zebrafish loci with developmental phenotypes, we have generated a dataset that will be useful not only for cloning the affected genes, but also for suggesting allelism of mutations with similar phenotypes identified in future screens. Furthermore, this work validates the usefulness of our methodology for rapid, systematic and inexpensive microsatellite mapping of zebrafish mutations.
The James Webb Space Telescope Mission
Twenty-six years ago a small committee report, building on earlier studies, expounded a compelling and poetic vision for the future of astronomy, calling for an infrared-optimized space telescope with an aperture of at least . With the support of their governments in the US, Europe, and Canada, 20,000 people realized that vision as the James Webb Space Telescope. A generation of astronomers will celebrate their accomplishments for the life of the mission, potentially as long as 20 years, and beyond. This report and the scientific discoveries that follow are extended thank-you notes to the 20,000 team members. The telescope is working perfectly, with much better image quality than expected. In this and accompanying papers, we give a brief history, describe the observatory, outline its objectives and current observing program, and discuss the inventions and people who made it possible. We cite detailed reports on the design and the measured performance on orbit.
Comment: Accepted by PASP for the special issue on The James Webb Space Telescope Overview, 29 pages, 4 figures
Journal of Proteomics & Bioinformatics - Open Access, www.omicsonline.com, Research Article, JPB/Vol.2/June 2009
Optimization of the Use of Consensus Methods for the Detection and Putative Identification of Peptides via Mass Spectrometry Using Protein Standards
Copyright: © 2009 Sultana T, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Correct identification of peptides and proteins in complex biological samples from proteomic mass spectra is a challenging problem in bioinformatics. The sensitivity and specificity of identification algorithms depend on the underlying scoring methods, some being more sensitive and others more specific. For high-throughput, automated peptide identification, control over an algorithm's trade-off between sensitivity and specificity is desirable. Combinations of algorithms, called 'consensus methods', have been shown to provide more accurate results than individual algorithms. However, due to the proliferation of algorithms and their varied internal settings, a systematic understanding of the relative performance of individual and consensus methods is lacking. We performed an in-depth analysis of various approaches to consensus scoring using known protein mixtures, and evaluated the performance of 2310 settings generated from the consensus of three different search algorithms: Mascot, Sequest, and X!Tandem. Our findings indicate that the union of Mascot, Sequest
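The sensitivity/specificity trade-off between consensus schemes can be made concrete with a toy sketch. The engine names follow the abstract, but the data structure (a set of peptide identifications per engine) and the voting threshold are assumptions for illustration, not the paper's actual scoring settings:

```python
from collections import Counter

def consensus(ids_by_engine, mode="union", min_votes=2):
    """Combine peptide identifications from several search engines.

    'union'        : any peptide reported by at least one engine (more sensitive)
    'intersection' : only peptides reported by every engine (more specific)
    'vote'         : peptides reported by at least min_votes engines (a middle ground)
    """
    sets = [set(ids) for ids in ids_by_engine.values()]
    if mode == "union":
        return set().union(*sets)
    if mode == "intersection":
        return set.intersection(*sets)
    votes = Counter(p for s in sets for p in s)
    return {p for p, count in votes.items() if count >= min_votes}
```

Sweeping the mode and threshold over many internal engine settings is one simple way to generate the kind of large grid of consensus configurations (2310 in the study) whose sensitivity and specificity can then be benchmarked against a known protein mixture.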